Database searching by flexible protein structure alignment.

Authors

  • Yuzhen Ye
  • Adam Godzik
Abstract

We recently developed a flexible protein structure alignment program (FATCAT) that identifies structural similarity while accounting for the flexibility of protein structures. One of the most important applications of a structure alignment method is to aid functional annotation by identifying similar structures in large structural databases. However, none of the flexible structure alignment methods has been applied to this task, because flexible alignments lacked an estimate of significance. In this paper, we develop an estimate of the statistical significance of the FATCAT alignment score, allowing us to use FATCAT as a database-searching tool. The results reported here show that (1) the distribution of the FATCAT similarity score between two unrelated protein structures follows the extreme value distribution (EVD), adding one more example to the current collection of EVDs of sequence and structure similarities; (2) introducing flexibility into structure comparison only slightly influences the sensitivity and specificity of identifying similar structures; and (3) the overall performance of FATCAT as a database-searching tool is comparable to that of the widely used rigid-body structure comparison programs DALI and CE. Two examples illustrating the advantages of using flexible structure alignments in database searching are also presented. The conformational flexibility detected in the first example may be involved in substrate specificity, and that detected in the second example may reflect the evolution of structures by block building.
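To illustrate how an EVD-based significance estimate can work in principle, the sketch below fits a Gumbel-type extreme value distribution to a set of background scores and converts a new score into a P-value. It is not the FATCAT procedure: the abstract does not give the parameterization or fitting method, so the Gumbel form, the method-of-moments fit, and the names `fit_gumbel` and `p_value` are all assumptions made for illustration, and the background scores are synthetic.

```python
import math
import random
import statistics

# Euler-Mascheroni constant, used by the method-of-moments Gumbel fit.
EULER_GAMMA = 0.5772156649015329


def fit_gumbel(scores):
    """Estimate Gumbel location (mu) and scale (beta) by method of moments.

    Assumption: scores from unrelated structure pairs follow a Gumbel-type
    EVD. This is a generic textbook fit, not the paper's procedure.
    """
    if len(scores) < 2:
        raise ValueError("need at least two background scores")
    mean = statistics.fmean(scores)
    std = statistics.stdev(scores)
    if std == 0:
        raise ValueError("background scores have zero variance")
    beta = std * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def p_value(score, mu, beta):
    """Probability that a random score is >= `score` under Gumbel(mu, beta)."""
    z = (score - mu) / beta
    if z < -700:  # exp(-z) would overflow; the tail probability is 1 here
        return 1.0
    # 1 - exp(-exp(-z)), written with expm1 to stay accurate for small values
    return -math.expm1(-math.exp(-z))


if __name__ == "__main__":
    # Synthetic background: maxima of Gaussian draws, which are roughly
    # Gumbel-distributed. Real use would need scores of unrelated pairs.
    rng = random.Random(0)
    background = [max(rng.gauss(0, 1) for _ in range(50)) for _ in range(2000)]
    mu, beta = fit_gumbel(background)
    for s in (2.0, 3.0, 4.0):
        print(f"score={s:.1f}  P-value={p_value(s, mu, beta):.4g}")
```

In a database search, a hit would be reported when its P-value falls below a chosen threshold; the threshold and any normalization of the raw score (for example by alignment length) are separate choices that this sketch does not cover.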

Similar resources

FATCAT: a web server for flexible structure comparison and structure similarity searching

Protein structure comparison, an important problem in structural biology, has two main applications: (i) comparing two protein structures in order to identify the similarities and differences between them, and (ii) searching for structures similar to a query structure. Many web-based resources for both applications are available, but all are based on rigid structural alignment algorithms. FATCA...

Sequence Alignment as a Database Technology Challenge

Sequence alignment is an important task for molecular biologists. Because alignment basically deals with approximate string matching on large biological sequence collections, it is both data intensive and computationally complex. There exist several tools for the variety of problems related to sequence alignment. Our first observation is that the term ’sequence database’ is used in general for ...

Fast Bayesian Shape Matching Using Geometric Algorithms

We present a Bayesian approach to comparison of geometric shapes with applications to classification of the molecular structures of proteins. Our approach involves the use of distributions defined on transformation invariant shape spaces and the specification of prior distributions on bipartite matchings. Here we emphasize the computational aspects of posterior inference arising from such model...

Identification of BKCa channel openers by molecular field alignment and patent data-driven analysis

In this work, we present the first comprehensive molecular field analysis of patent structures on how the chemical structure of drugs impacts the biological binding. This task was formulated as searching for drug structures to reveal shared effects of substitutions across a common scaffold and the chemical features that may be responsible. We used the SureChEMBL patent database, which prov...

Flexible Structural Neighborhood—a database of protein structural similarities and alignments

Protein structures are flexible, changing their shapes not only upon substrate binding, but also during evolution as a collective effect of mutations, deletions and insertions. A new generation of protein structure comparison algorithms allows for such flexibility; they go beyond identifying the largest common part between two proteins and find hinge regions and patterns of flexibility in prote...

Journal:
  • Protein Science: a publication of the Protein Society

Volume 13, Issue 7

Pages: -

Publication date: 2004